ABSTRACT


This  note  presents  a  tutorial  survey  of  the  mathematics  that  is  used  in  the 
study  of  linear  predictive  filtering  as  applied  to  the  analysis  and  synthesis  of 
speech. Speech is modelled as the output of an all-pole filter that is driven by
either  a  periodic  pulse  train  or  white  noise.  A  minimum-mean-squared-error 
technique  for  estimating  the  coefficients  of  this  filter  from  speech  data  is 
presented.  This  technique  leads  to  a  set  of  equations  for  the  coefficient 
estimates  which  can  be  solved  by  a  computationally  efficient  recursive  technique 
known  as  Levinson's  method. 

The  filter  derived  by  the  above  mentioned  technique  can  be  realized  by 
any  standard  technique;  however,  a  particularly  interesting  realization  is  in 
terms  of  a  digital  simulation  of  a  non-uniform  acoustic  tube.  It  is  shown 
that  any  stable  all-pole  filter  can  be  realized  as  an  acoustic  tube  and,  moreover, 
that  the  Levinson  recursion  produces  as  a  by-product  exactly  the  reflection 
coefficients  needed  for  such  a  realization. 

The  report  concludes  by  showing  how  the  classical  theory  of  orthogonal 
polynomials can be applied to the speech analysis/synthesis problem and used
to derive many of the results obtained above by other means.
INTRODUCTION


The  purpose  of  this  note  is  to  present  a  tutorial  discussion  of  the  mathematical 
theory  underlying  the  analysis  and  synthesis  of  speech  by  means  of  linear 
predictive filtering. None of the results presented here are new, all having
appeared either in the literature or in research reports. The main reason
for the present note is to present these scattered results from a unified
standpoint and, in some cases, to provide more detail than is available in the
literature. 

The  basic  speech  problem  under  consideration  can  be  formulated  as 
follows.*  Samples  of  a  speech  waveform  are  modelled  as  being  the  output  of 
a digital filter that has been excited by either a series of equally spaced pulses
or white noise, depending on whether the speech is voiced or unvoiced. The filter
is  described  by  the  difference  equation 


$$ s_n = \sum_{k=1}^{p} a_k s_{n-k} + u_n \qquad (1) $$


where $u_n$ denotes the nth sample of the excitation and $s_n$ denotes the nth sample of
speech. The filter order $p$ is assumed to be known on the basis of other
considerations. The transfer function of this filter is easily seen to be $[H_p(z)]^{-1}$,
where

$$ H_p(z) = 1 - \sum_{k=1}^{p} a_k z^{-k} \qquad (2) $$

from which it is apparent that $[H_p(z)]^{-1}$ is an all-pole filter. The problem at hand
is to use samples of real speech to arrive at an estimate of the filter coefficients
$a_k$, and then to use these coefficients to synthesize a filter that could be
used to regenerate the original speech. The latter operation requires a knowledge
of whether the original speech was voiced or unvoiced, but the problem of how
to obtain this information is not the concern of the present work.

There are many ways one could go about estimating the filter coefficients
from the speech samples. The particular method that will be considered in this
note is a minimum-mean-squared-error technique that will now be
described.

* This section is based on references 1, 2, 6, 8, 9.

Select a group of $N+1$ speech samples which, for convenience, will
be numbered from $n = 0$ to $n = N$. Define a sequence $s_n$ by

$$ s_n = \begin{cases} \text{speech sample}, & 0 \le n \le N \\ 0, & n < 0,\ n > N \end{cases} \qquad (3) $$

and define the mean-squared prediction error by

$$ e = \sum_{n=-\infty}^{\infty} \left[ s_n - \sum_{k=1}^{p} a_k s_{n-k} \right]^2 \qquad (4) $$

The quantity $e$ is a function of the assumed values for the $a_k$'s. The
desired estimate for the $a_k$'s is obtained by choosing those values that yield
a minimum value of $e$.

This problem can be solved by first expanding equation (4) as follows*

$$ e = \sum_n s_n^2 - 2 \sum_{k=1}^{p} a_k \sum_n s_n s_{n-k} + \sum_{k,j=1}^{p} a_k a_j \sum_n s_{n-k} s_{n-j} $$

$$ \phantom{e} = R_0 - 2 \sum_{k=1}^{p} a_k R_k + \sum_{k,j=1}^{p} a_k a_j R_{k-j} \qquad (5) $$

where the autocorrelation function $R_k$ is defined by

$$ R_k = R_{-k} = \sum_n s_n s_{n-k} \qquad (6) $$


It will be convenient to rewrite equation (5) in matrix form as follows:

$$ e = R_0 - 2\,\mathbf{a}^T \mathbf{r} + \mathbf{a}^T \mathbf{R}\,\mathbf{a} \qquad (7) $$

where

$$ \mathbf{a}^T = [a_1, \ldots, a_p], \qquad \mathbf{r}^T = [R_1, R_2, \ldots, R_p] \qquad (8) $$

and the correlation matrix $\mathbf{R}$ has as its $(i,j)$th element $R_{i-j}$. Note that
because $\mathbf{R}$ is a correlation matrix, it is positive definite and, therefore,
non-singular.

* All sums without limits will henceforth be assumed to run from $n = -\infty$ to $n = \infty$.

Completing the square in equation (7) yields the result,

$$ e = (\mathbf{a} - \mathbf{R}^{-1}\mathbf{r})^T\, \mathbf{R}\, (\mathbf{a} - \mathbf{R}^{-1}\mathbf{r}) + R_0 - \mathbf{r}^T \mathbf{R}^{-1} \mathbf{r} \qquad (9) $$

Equation (9) may be verified simply by multiplying out the quadratic form and
cancelling the appropriate terms. The desired minimization can now be
performed by noting that, since $\mathbf{R}$ is positive definite, the minimum value of the
quadratic form in equation (9) is zero and can be achieved by setting $\mathbf{a}$ equal
to $\mathbf{a}^{(p)}$ where

$$ \mathbf{a}^{(p)} = \mathbf{R}^{-1} \mathbf{r} \qquad (10) $$


The resulting minimum $e$ is given by

$$ e_{\min}^{(p)} = R_0 - \mathbf{r}^T \mathbf{R}^{-1} \mathbf{r} = R_0 - \mathbf{r}^T \mathbf{a}^{(p)} = R_0 - \sum_{k=1}^{p} a_k^{(p)} R_k \qquad (11) $$

The use of the superscript $p$ to denote the minimizing $a_k$'s and $e_{\min}$ may
seem peculiar but the reason for this notation will become apparent in the next
section.
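As an illustration, equations (6), (10) and (11) can be sketched numerically as follows. This is a minimal sketch, assuming NumPy is available; the function names and the decaying-sinusoid test frame are hypothetical stand-ins and do not come from the report.

```python
import numpy as np

def autocorrelation(s, p):
    """R_k = sum_n s_n s_{n-k} for k = 0..p, with s_n = 0 outside the frame (eq. 6)."""
    s = np.asarray(s, dtype=float)
    return np.array([np.dot(s[k:], s[:len(s) - k]) for k in range(p + 1)])

def solve_normal_equations(R):
    """Solve R a = r (eq. 10) and return (a, e_min), with e_min taken from eq. (11)."""
    p = len(R) - 1
    # Toeplitz correlation matrix whose (i, j) element is R_{|i-j|}
    idx = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    a = np.linalg.solve(R[idx], R[1:])
    e_min = R[0] - np.dot(a, R[1:])
    return a, e_min

# Hypothetical test frame: a decaying sinusoid standing in for speech samples
frame = np.array([np.exp(-0.05 * n) * np.cos(0.6 * n) for n in range(64)])
R = autocorrelation(frame, p=4)
a, e_min = solve_normal_equations(R)

# Prediction error sequence s_n - sum_k a_k s_{n-k} over the full (zero-extended) range
err = np.convolve(frame, np.concatenate(([1.0], -a)))
```

Summing the squares of `err` evaluates equation (4) directly, so it should agree with `e_min` up to rounding.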

Equation (10) expresses the solution to a set of linear equations in matrix
notation. In ordinary notation, the equations to which equation (10) is the
solution are

$$ R_i - \sum_{k=1}^{p} a_k^{(p)} R_{i-k} = 0, \qquad i = 1, \ldots, p \qquad (12) $$


These  equations, called  the  autocorrelation  normal  equations,  will  play 


THE LEVINSON RECURSION


The autocorrelation normal equations (12) can be solved in a recursive
way by means of a technique known as Levinson's method.* To derive this
technique, first assume that the solution to the nth order autocorrelation
normal equations is known and denote it by $a_k^{(n)}$, $k = 1, \ldots, n$. Next, write
down the (n+1)st order equations in the form

$$ R_i - \sum_{k=1}^{n} a_k^{(n+1)} R_{i-k} - a_{n+1}^{(n+1)} R_{i-n-1} = 0, \qquad i = 1, \ldots, n $$

$$ R_{n+1} - \sum_{k=1}^{n} a_k^{(n+1)} R_{n+1-k} - a_{n+1}^{(n+1)} R_0 = 0 \qquad (13) $$


A neat way of getting at the Levinson recursion is to assume a solution to (13)
of the form

$$ a_k^{(n+1)} = a_k^{(n)} - b_k, \qquad k = 1, \ldots, n \qquad (14) $$

with $a_{n+1}^{(n+1)}$ to be determined later. Substitution of (14) into the first
n of equations (13) leads to the new equation

$$ \sum_{k=1}^{n} b_k R_{i-k} - a_{n+1}^{(n+1)} R_{i-n-1} = 0, \qquad i = 1, \ldots, n \qquad (15) $$

Motivated by the fact that equations (15) look very much like the nth order
autocorrelation normal equations, the change of variable $j = n + 1 - i$ is made
(recalling that $R_{-k} = R_k$) with the result

$$ a_{n+1}^{(n+1)} R_j - \sum_{k=1}^{n} b_k R_{j+k-n-1} = 0, \qquad j = 1, \ldots, n \qquad (16) $$

Next, the change of variable $\ell = n + 1 - k$ is made and (16) becomes

$$ a_{n+1}^{(n+1)} R_j - \sum_{\ell=1}^{n} b_{n+1-\ell} R_{j-\ell} = 0, \qquad j = 1, \ldots, n \qquad (17) $$


Since equations (17) are a scaled version of the nth order autocorrelation normal
equations, their solution is evidently given by

$$ b_{n+1-\ell} = a_{n+1}^{(n+1)} a_\ell^{(n)}, \qquad \ell = 1, \ldots, n \qquad (18) $$

and, therefore,

$$ a_k^{(n+1)} = a_k^{(n)} - a_{n+1}^{(n+1)} a_{n+1-k}^{(n)}, \qquad k = 1, \ldots, n \qquad (19) $$

* See reference 7.


It only remains to see if a value of $a_{n+1}^{(n+1)}$ can be found such that
the last remaining equation in the set (13) can be satisfied. Using (19),
this equation now reads,

$$ R_{n+1} - \sum_{k=1}^{n} \left[ a_k^{(n)} - a_{n+1}^{(n+1)} a_{n+1-k}^{(n)} \right] R_{n+1-k} - a_{n+1}^{(n+1)} R_0 = 0 \qquad (20) $$

This equation can be solved for $a_{n+1}^{(n+1)}$ with the result

$$ a_{n+1}^{(n+1)} = K_n = \frac{R_{n+1} - \sum_{k=1}^{n} a_k^{(n)} R_{n+1-k}}{R_0 - \sum_{k=1}^{n} a_k^{(n)} R_k} \qquad (21) $$

This result is meaningful as long as the denominator is not zero; however,
the denominator is exactly equal to the minimum mean squared error for the
nth stage of the process, $e_{\min}^{(n)}$, as given by equation (11). However, $e_{\min}^{(n)}$ can never
be zero, for if it were, it would follow that $s_n = \sum_{k=1}^{n} a_k s_{n-k}$ for all n. Since $s_n = 0$
for $n < 0$, this equation implies that $s_n = 0$ for all n. Since this case never arises
in practice, it follows that equation (21) is always meaningful.


The only ingredient missing to set this recursive process in motion is
a solution to the first order system and this can be written down by inspection
of (12) as

$$ a_1^{(1)} = K_0 = \frac{R_1}{R_0} \qquad (22) $$
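The recursion defined by equations (19), (21) and (22) can be sketched in a few lines. This is a minimal sketch, assuming NumPy; the function name `levinson` and the short test frame are hypothetical and serve only to illustrate the steps derived above.

```python
import numpy as np

def levinson(R, p):
    """Levinson recursion per eqs. (19), (21), (22).

    R holds the autocorrelations R_0..R_p. Returns (a, K) where
    a[k-1] = a_k^{(p)} and K[n] = K_n = a_{n+1}^{(n+1)} for n = 0..p-1.
    """
    R = np.asarray(R, dtype=float)
    a = np.zeros(0)          # order-0 predictor has no coefficients
    K = np.zeros(p)
    for n in range(p):
        # eq. (21): numerator R_{n+1} - sum_k a_k R_{n+1-k},
        #           denominator R_0 - sum_k a_k R_k (for n = 0 this is eq. 22)
        num = R[n + 1] - np.dot(a, R[n:0:-1])
        den = R[0] - np.dot(a, R[1:n + 1])
        K[n] = num / den
        # eq. (19) for k = 1..n, then append a_{n+1}^{(n+1)} = K_n
        a = np.concatenate((a - K[n] * a[::-1], [K[n]]))
    return a, K

# Hypothetical frame standing in for speech samples
frame = np.array([1.0, 0.5, -0.3, 0.8, 0.2, -0.6, 0.1, 0.4])
R = np.array([np.dot(frame[k:], frame[:len(frame) - k]) for k in range(4)])
a, K = levinson(R, 3)
```

The result can be compared against a direct solution of equation (10); the two should agree to rounding error while the recursion needs on the order of p^2 operations instead of p^3.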


For later considerations, it will be useful to rewrite the Levinson recursion
in terms of the inverse filter transfer function $H_p(z)$ instead of in terms of the
coefficients $a_k^{(n)}$ as given by equations (19) and (21). This recursion is easily
seen to be given by,

$$ H_{n+1}(z) = H_n(z) - K_n z^{-(n+1)} H_n(z^{-1}) \qquad (23) $$

with $K_n$ being determined by the $R_k$'s via equation (21). The initial condition
for (23) is given by

$$ H_1(z) = 1 - \frac{R_1}{R_0} z^{-1} \qquad (24) $$

It is evident from equation (22) that $|K_0| < 1$ and it turns out that this
is true for all n. Since this fact will be vital in the sequel it will be proved
now.

To this end, it will be necessary to rewrite equation (21) in the z-transform
domain by making use of the easily verified identity,

$$ R_k = \sum_n s_n s_{n-k} = \int_{-1/2}^{1/2} e^{j 2\pi f k} \left| S(e^{j 2\pi f}) \right|^2 df \qquad (25) $$

where $S(z)$ denotes the z-transform of the speech samples

$$ S(z) = \sum_n s_n z^{-n} \qquad (26) $$

In order to simplify notation, equation (25) will be rewritten as

$$ R_k = \int z^k \, |S(z)|^2 \, df \qquad (27) $$


where the convention in force here and in the sequel is that all integrals have
limits $(-\tfrac{1}{2}, \tfrac{1}{2})$ and whenever the variable $z$ appears under an integral sign, it is
understood to be equal to $e^{j 2\pi f}$. Equation (21), which defines $K_n$, now can be
rewritten in the form

$$ K_n = \frac{\int |S(z)|^2 \left[ z^{-(n+1)} - \sum_{k=1}^{n} a_k^{(n)} z^{-(n+1-k)} \right] df}{\int |S(z)|^2 \left[ 1 - \sum_{k=1}^{n} a_k^{(n)} z^{-k} \right] df} = \frac{\int |S(z)|^2 \, z^{-(n+1)} H_n(z^{-1}) \, df}{\int |S(z)|^2 \, H_n(z) \, df} \qquad (28) $$

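Since the denominator of this equation is the minimum mean squared error, it
follows that,

$$ e^{(n)} = \int |S(z)|^2 \, H_n(z) \, df \qquad (29) $$

A recursion for $e^{(n)}$ can easily be derived by writing

$$ e^{(n+1)} = \int |S(z)|^2 \, H_{n+1}(z) \, df $$

$$ \phantom{e^{(n+1)}} = \int |S(z)|^2 \left[ H_n(z) - K_n z^{-(n+1)} H_n(z^{-1}) \right] df $$

$$ \phantom{e^{(n+1)}} = e^{(n)} - K_n \int |S(z)|^2 \, z^{-(n+1)} H_n(z^{-1}) \, df $$

$$ \phantom{e^{(n+1)}} = e^{(n)} \left( 1 - K_n^2 \right) \qquad (30) $$

where the last step follows from equation (28).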

Since $e^{(n)}$ must always be positive, it follows from the last equation that
$|K_n| < 1$ as advertised.
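The error recursion of equation (30) also gives a convenient way to carry the denominator of equation (21) from one stage to the next. The following is a minimal sketch under that reading, assuming NumPy; the function name and the test frame are hypothetical.

```python
import numpy as np

def levinson_with_error(R, p):
    """Levinson recursion that tracks e^(n) via e^(n+1) = e^(n) (1 - K_n^2), eq. (30)."""
    R = np.asarray(R, dtype=float)
    a = np.zeros(0)
    e = [R[0]]                 # e^(0) = R_0: error with no prediction at all
    K = []
    for n in range(p):
        # eq. (21), with the denominator replaced by the running error e^(n)
        k_n = (R[n + 1] - np.dot(a, R[n:0:-1])) / e[-1]
        a = np.concatenate((a - k_n * a[::-1], [k_n]))   # eq. (19)
        K.append(k_n)
        e.append(e[-1] * (1.0 - k_n ** 2))               # eq. (30)
    return a, np.array(K), np.array(e)

# Hypothetical frame standing in for speech samples
frame = np.array([0.2, 0.9, 0.4, -0.5, -0.7, 0.1, 0.6, 0.3, -0.2, -0.4])
R = np.array([np.dot(frame[k:], frame[:len(frame) - k]) for k in range(5)])
a, K, e = levinson_with_error(R, 4)
```

The final entry of `e` should match equation (11) evaluated with the final coefficients, and the sequence `e` should be positive and non-increasing, which is the numerical counterpart of $|K_n| < 1$.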

As an important application of the result that $|K_n| < 1$ it will be shown
that all the zeros of $H_n(z)$ lie strictly inside the unit circle, which implies
that the speech synthesis filters $[H_n(z)]^{-1}$ will always be stable. The
proof proceeds by induction by first noting that because a correlation function
is always maximum at the origin, $|R_1| < R_0$, it follows that $H_1(z)$, as
defined by equation (24), has its zero inside the unit circle. Next, assume that
$H_n(z)$ has its n zeros inside the unit circle. Multiplying equation (23) by $z^{n+1}$ and
noting that, on the unit circle, $|z^{n+1} H_n(z)| = |H_n(z^{-1})|$, so that
$|K_n H_n(z^{-1})| < |z^{n+1} H_n(z)|$ there, it follows from Rouché's
theorem* that $z^{n+1} H_{n+1}(z)$ and $z^{n+1} H_n(z)$ have equal numbers of zeros
inside the unit circle. Since $z^{n+1} H_n(z)$ has n+1 zeros inside the unit circle,
the proof of the statement follows by induction.

*  Reference  10,  p.  116. 
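The stability result can be checked numerically by computing the zeros of $H_p(z)$ for a predictor obtained from the autocorrelation normal equations. This is a minimal sketch, assuming NumPy; the helper name and the pseudo-random test frame are hypothetical and merely stand in for speech data.

```python
import numpy as np

def inverse_filter_zeros(a):
    """Zeros of H_p(z) = 1 - sum_k a_k z^{-k}, i.e. roots of z^p - a_1 z^(p-1) - ... - a_p."""
    return np.roots(np.concatenate(([1.0], -np.asarray(a, dtype=float))))

# Hypothetical frame standing in for speech samples (not from the report)
rng = np.random.default_rng(0)
frame = rng.standard_normal(200)
p = 8
R = np.array([np.dot(frame[k:], frame[:len(frame) - k]) for k in range(p + 1)])
toeplitz_R = R[np.abs(np.subtract.outer(np.arange(p), np.arange(p)))]
a = np.linalg.solve(toeplitz_R, R[1:])      # eq. (10)
zeros = inverse_filter_zeros(a)
max_radius = np.max(np.abs(zeros))          # expected to be below 1
```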




The  Nonuniform  Acoustic  Tube 


Figure 1 depicts three sections of a nonuniform acoustic tube. The
cross-sectional area of the nth section is $A_n$ and the length of all sections
is $\Delta$. The forward and backward components of the volume velocity measured
at the left-hand end of the nth section are sampled every $2\Delta/c$ seconds and the
z-transforms of these samples are denoted by $V_n^+(z)$ and $V_n^-(z)$. The constant
$c$ denotes the velocity of sound in the tube.

The relationship between the volume velocities in the nth and (n+1)st
sections can be determined by writing down the continuity equations for volume
velocity and acoustic pressure at the boundary between the nth and (n+1)st sections.
The z-transforms of the forward and backward volume velocities measured
at the right-hand end of the nth section are given by $z^{-1/2} V_n^+(z)$ and $z^{1/2} V_n^-(z)$
respectively. The continuity of volume velocity can now be expressed by the equation

$$ V_{n+1}^+(z) - V_{n+1}^-(z) = z^{-1/2} V_n^+(z) - z^{1/2} V_n^-(z) \qquad (31) $$

Since the acoustic impedance of the nth section is given by $\rho c / A_n$, where $\rho$
denotes the density of air, the continuity of acoustic pressure is expressed
by the equation,

$$ \frac{\rho c}{A_{n+1}} \left[ V_{n+1}^+(z) + V_{n+1}^-(z) \right] = \frac{\rho c}{A_n} \left[ z^{-1/2} V_n^+(z) + z^{1/2} V_n^-(z) \right] \qquad (32) $$

These equations can be solved for $V_{n+1}^+(z)$ and $V_{n+1}^-(z)$ with the result,

$$ V_{n+1}^+(z) = \frac{1}{1 + r_n} \left[ z^{-1/2} V_n^+(z) - r_n z^{1/2} V_n^-(z) \right] $$

$$ V_{n+1}^-(z) = \frac{1}{1 + r_n} \left[ -r_n z^{-1/2} V_n^+(z) + z^{1/2} V_n^-(z) \right] \qquad (33) $$

where the reflection coefficient $r_n$ is defined by

$$ r_n = \frac{A_n - A_{n+1}}{A_n + A_{n+1}} \qquad (34) $$

* This section is based primarily on reference 1. Note carefully that the numbering
of the tube sections differs from that in ref. 1 in that n here corresponds to Wakita's
M-n.


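In matrix form these equations read,

$$ \begin{bmatrix} V_{n+1}^+(z) \\ V_{n+1}^-(z) \end{bmatrix} = \frac{z^{-1/2}}{1 + r_n} \begin{bmatrix} 1 & -r_n z \\ -r_n & z \end{bmatrix} \begin{bmatrix} V_n^+(z) \\ V_n^-(z) \end{bmatrix} \qquad (35) $$

Equation (35) can be inverted easily with the result,

$$ \begin{bmatrix} V_n^+(z) \\ V_n^-(z) \end{bmatrix} = \frac{z^{1/2}}{1 - r_n} \begin{bmatrix} 1 & r_n \\ r_n z^{-1} & z^{-1} \end{bmatrix} \begin{bmatrix} V_{n+1}^+(z) \\ V_{n+1}^-(z) \end{bmatrix} \qquad (36) $$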

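These equations can be conveniently normalized by introducing the
quantities

$$ U_n^+(z) = \frac{z^{n/2}\, V_n^+(z)}{\prod_{i=1}^{n-1} (1 - r_i)}, \qquad U_n^-(z) = \frac{z^{n/2}\, V_n^-(z)}{\prod_{i=1}^{n-1} (1 - r_i)} \qquad (37) $$

in terms of which equation (36) becomes:

$$ \begin{bmatrix} U_n^+(z) \\ U_n^-(z) \end{bmatrix} = \begin{bmatrix} 1 & r_n \\ r_n z^{-1} & z^{-1} \end{bmatrix} \begin{bmatrix} U_{n+1}^+(z) \\ U_{n+1}^-(z) \end{bmatrix} \qquad (38) $$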


The quantities $U_n^+(z)$ and $U_n^-(z)$ can be interpreted as the forward and backward
components of volume velocity in a fictitious acoustic tube which differs
from the real tube only in that a gain factor $\prod_{i=1}^{n-1} (1 - r_i)$ and an overall
delay $z^{-n/2}$ have been removed.


Equation (38) can be used to derive a digital network whose response is the same as that
of the acoustic tube. To accomplish this, equation (38) is first rewritten in the
form

$$ U_{n+1}^+(z) = U_n^+(z) - r_n U_{n+1}^-(z) $$

$$ U_n^-(z) = z^{-1} \left[ r_n U_{n+1}^+(z) + U_{n+1}^-(z) \right] \qquad (39) $$

The  digital  network  that  is  generated  by  equation  (39)  is  shown  in  Figure  2. 

This  network  as  drawn  is  incomplete  because  no  termination  has  been  specified 
thus  making  it  impossible  to  compute  the  sequence  of  backward  going  waves. 

As  an  example  of  a  termination  (one  that  will  play  a  role  in  the  sequel)  assume 
the  end  of  the  tube  is  connected  to  a  tube  of  infinite  cross-section  and  of  infinite 
length  i.e. ,  free  space  filled  with  air.  This  means  that  the  final  reflection  coefficient 
is  -1  and  that  there  is  no  backward  wave  at  the  output.  The  network  for  this 
arrangement  is  shown  in  Figure  3  with  the  inputs  to  the  network  being  the  output 
of  an  N-section  acoustic  tube. 

The next order of business is to compute the transfer function of an
N-section acoustic tube. This will be done for the tube termination depicted in
Figure 3 which implies that $U_{out}(z) = U_{N+1}^+(z)$. Since equation (38) enables
one to recursively compute the z-transforms of the forward and backward waves
in the nth section of the tube in terms of their counterparts in the (n+1)st section, it is
natural to assume a simple output z-transform and then compute the input z-transform
$U_1^+(z)$ that produced this output. If $U_{out}(z) = 1$ is assumed, then it follows that
$U_{N+1}^+(z) = 1$ and $U_{N+1}^-(z) = -z^{-1}$. Equation (38) is now employed N times to arrive
at $U_1^+(z)$ and it follows that the tube's transfer function is

$$ T(z) = \frac{U_{out}(z)}{U_1^+(z)} = \left[ U_1^+(z) \right]^{-1} \qquad (40) $$


The computation just described is related to the Levinson recursion in
a very important way. To make this fact clear, the Levinson recursion must
be rewritten by introducing the functions $G_n^+(z)$ and $G_n^-(z)$ defined by

$$ G_n^+(z) = H_n(z), \qquad G_n^-(z) = -z^{-(n+1)} H_n(z^{-1}) \qquad (41) $$

In terms of these functions, the Levinson recursion, equation (23), can be written
as a set of two recursions as follows:

$$ G_{n+1}^+(z) = G_n^+(z) + K_n G_n^-(z) $$

$$ G_{n+1}^-(z) = z^{-1} \left[ K_n G_n^+(z) + G_n^-(z) \right] \qquad (42) $$

or, in matrix form,

$$ \begin{bmatrix} G_{n+1}^+(z) \\ G_{n+1}^-(z) \end{bmatrix} = \begin{bmatrix} 1 & K_n \\ K_n z^{-1} & z^{-1} \end{bmatrix} \begin{bmatrix} G_n^+(z) \\ G_n^-(z) \end{bmatrix} \qquad (43) $$

The initial condition for the recursion is now

$$ G_0^+(z) = 1, \qquad G_0^-(z) = -z^{-1} \qquad (44) $$

A comparison of equations (42) and (38) reveals that these two recursions
are identical in form except that the indexing of the two is reversed, i.e.,
the acoustic tube indexing is from n = N to n = 1 but the Levinson recursion
indexes from n = 1 to n = N. Moreover, comparison of equation (44) and the
initial conditions used for computing the acoustic tube's transfer function shows
that these are also identical. What this all means is that an acoustic tube
with reflection coefficients given by $r_{N-n} = K_n$ has a transfer function given by

$$ T(z) = \left[ H_N(z) \right]^{-1} \qquad (45) $$

In other words, since the Levinson recursion yields the best estimate of the filter
inverse to the filter that produced the original speech samples, the acoustic tube
filter discussed above has a transfer function that is an estimate of the filter
that originally produced the speech. Thus, this acoustic tube filter is a natural
candidate for a filter to synthesize speech.
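The equivalence expressed by equation (45) can be checked numerically by applying equation (38) N times, starting from $U^+ = 1$, $U^- = -z^{-1}$, and comparing $U_1^+(z)$ with $H_N(z)$. This is a minimal sketch, assuming NumPy; polynomials in $z^{-1}$ are stored as coefficient arrays, and the function names and test frame are hypothetical.

```python
import numpy as np

def levinson(R, p):
    """Compact Levinson recursion (eqs. 19 and 21); returns (a, K)."""
    a, K = np.zeros(0), []
    for n in range(p):
        k_n = (R[n + 1] - np.dot(a, R[n:0:-1])) / (R[0] - np.dot(a, R[1:n + 1]))
        a = np.concatenate((a - k_n * a[::-1], [k_n]))
        K.append(k_n)
    return a, np.array(K)

def tube_input_polynomial(r):
    """Apply eq. (38) from section N down to section 1, with r[n-1] = r_n.

    Coefficient arrays are indexed by the power of z^{-1}. Starts from
    U^+ = 1, U^- = -z^{-1} and returns U_1^+(z), so T(z) = 1 / U_1^+(z) by eq. (40).
    """
    u_plus = np.array([1.0])
    u_minus = np.array([0.0, -1.0])
    for r_n in reversed(r):
        padded = np.append(u_plus, 0.0)                  # same length as u_minus
        new_plus = padded + r_n * u_minus                # U_n^+ = U_{n+1}^+ + r_n U_{n+1}^-
        new_minus = np.concatenate(([0.0], r_n * padded + u_minus))  # z^{-1}[r_n U^+ + U^-]
        u_plus, u_minus = new_plus, new_minus
    return u_plus

# Hypothetical frame standing in for speech samples
frame = np.array([1.0, 0.5, -0.3, 0.8, 0.2, -0.6, 0.1, 0.4])
R = np.array([np.dot(frame[k:], frame[:len(frame) - k]) for k in range(5)])
a, K = levinson(R, 4)
r = K[::-1]                              # r_{N-n} = K_n, so r_N = K_0 and r_1 = K_{N-1}
u1_plus = tube_input_polynomial(r)
h_N = np.concatenate(([1.0], -a))        # coefficients of H_N(z) in powers of z^{-1}
```

Under these assumptions `u1_plus` and `h_N` should coincide, which is the coefficient-level statement of $T(z) = [H_N(z)]^{-1}$.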


Atal (reference 8) has given a different derivation of the transfer function of
a nonuniform acoustic tube. His derivation leads to the transfer function given
by equation (45); however, his acoustic tube differs from the one derived above
mainly in that the input and output terminals are interchanged. In other words,
the reflection coefficient $K_0$, which appears at the output end of the acoustic tube
derived above, appears at the input end of Atal's acoustic tube. Mathematically
there does not seem to be any reason to choose one of these acoustic tubes over
the other since they have identical transfer functions; however, Wakita's
tube seems more natural as a model of the vocal tract. This follows from
the fact that Wakita's output termination is an infinite cross-section tube
which appears correct for modelling the interface between the lips and the
outside world.

It has now been demonstrated how speech data can be used to derive
a set of filter coefficients $a_k$ and a set of reflection coefficients $K_n$.
The former could be used in a direct-form realization of a speech synthesis
filter whereas the latter could be used to synthesize an acoustic tube synthesis
filter. Which of these realizations is better is still a topic for investigation.
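For the acoustic tube realization, equation (34) can be inverted to turn a set of reflection coefficients into relative cross-sectional areas, since it gives $A_{n+1} = A_n (1 - r_n)/(1 + r_n)$. The following is a minimal sketch of that conversion; the function names, the choice of unit area for the first section, and the numerical values are illustrative assumptions.

```python
def areas_from_reflection(r, first_area=1.0):
    """Areas A_1, A_2, ... from r_1, r_2, ... using eq. (34) solved for A_{n+1}."""
    areas = [first_area]
    for r_n in r:
        areas.append(areas[-1] * (1.0 - r_n) / (1.0 + r_n))
    return areas

def reflection_from_areas(areas):
    """Eq. (34): r_n = (A_n - A_{n+1}) / (A_n + A_{n+1})."""
    return [(a_n - a_next) / (a_n + a_next) for a_n, a_next in zip(areas, areas[1:])]

# Hypothetical Levinson output K_0, K_1; the tube uses r_{N-n} = K_n
K = [-0.2, 0.5]
r = K[::-1]                      # r_1 = K_1, r_2 = K_0
areas = areas_from_reflection(r)
```

Only area ratios are determined by the reflection coefficients, so the absolute scale set by `first_area` is arbitrary.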

For the sake of completeness, this section will conclude by showing how
an arbitrary, stable all-pole filter, $[H_N(z)]^{-1}$, can be realized as an
acoustic tube.

The basic tool for this demonstration is the so-called backward Levinson
recursion which can be derived from the forward Levinson recursion, equation (23),
as follows. Solving equation (23) for $H_n(z)$ yields the relation,

$$ H_n(z) = H_{n+1}(z) + K_n z^{-(n+1)} H_n(z^{-1}) \qquad (46) $$

Next, replace $z$ by $z^{-1}$ in equation (23) and solve again for $K_n H_n(z)$ with the
result:

$$ K_n H_n(z) = z^{-(n+1)} \left[ H_n(z^{-1}) - H_{n+1}(z^{-1}) \right] \qquad (47) $$

The elimination of $H_n(z^{-1})$ between equations (46) and (47) leads to the
desired result:

$$ H_n(z) = \frac{1}{1 - K_n^2} \left[ H_{n+1}(z) + K_n z^{-(n+1)} H_{n+1}(z^{-1}) \right] \qquad (48) $$

Since the constant term in $H_n(z)$ is unity, it follows from equation (48)
that

$$ K_n = a_{n+1}^{(n+1)} = - \left[ z^{n+1} H_{n+1}(z) \right]_{z=0} \qquad (49) $$


Let $H_N(z)$ denote an arbitrary Nth order polynomial in $z^{-1}$ with constant
term equal to unity. Furthermore, assume that all the zeros of $H_N(z)$ lie strictly
inside the unit circle so that $[H_N(z)]^{-1}$ is the transfer function of a stable, all-pole
filter. Since all the zeros of $H_N(z)$ are inside the unit circle and since the
coefficient of $z^{-N}$ in $H_N(z)$ is, to within a sign, the product of all the zeros of
$H_N(z)$, it follows that $K_{N-1}$ as given by equation (49) satisfies $|K_{N-1}| < 1$.


Assume next that the backward Levinson recursion, equation (48),
has been implemented n times and that $|K_{N-n-1}| < 1$ and that the polynomial
$H_{N-n}(z)$ has a constant term equal to unity and that all its zeros lie inside the
unit circle. It now follows from an application of Rouché's theorem that
$H_{N-n-1}(z)$ as given by equation (48) has all of its zeros inside the unit circle
and, therefore, that $|K_{N-n-2}| < 1$. The details of this argument will not be
given here because they are virtually identical to those given earlier when it
was shown that the forward Levinson recursion leads to stable filters as long
as the $K_n$'s used satisfy $|K_n| < 1$. It now follows by induction that all
the $K_n$'s produced by the backward Levinson recursion, equations (48) and (49),
satisfy $|K_n| < 1$ as long as the starting polynomial $H_N(z)$ had all of its zeros
inside the unit circle.

Since it is obvious that a forward Levinson recursion using the $K_n$'s derived
from a backward Levinson recursion will yield back the starting polynomial
$H_N(z)$, it follows from the discussion earlier in this section that a properly
terminated acoustic tube having these $K_n$'s as reflection coefficients will have
a transfer function given by $[H_N(z)]^{-1}$. It has thus been shown how an
arbitrary, stable all-pole filter can be realized as an acoustic tube.
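The backward recursion of equations (48) and (49) can be sketched at the coefficient level as follows. This is a minimal sketch, assuming NumPy; the function names and the second-order example filter are hypothetical. In coefficient form, $z^{-(n+1)} H_{n+1}(z^{-1})$ is simply the coefficient array of $H_{n+1}(z)$ reversed.

```python
import numpy as np

def backward_levinson(a):
    """Recover K_0..K_{N-1} from a_1..a_N via eqs. (48) and (49)."""
    h = np.concatenate(([1.0], -np.asarray(a, dtype=float)))   # H_N(z) in powers of z^{-1}
    K = []
    while len(h) > 1:
        k_n = -h[-1]                          # eq. (49): K_n = a_{n+1}^{(n+1)}
        if abs(k_n) >= 1.0:
            raise ValueError("filter is not stable: |K_n| >= 1")
        # eq. (48); the highest-order coefficient cancels and is dropped
        h = (h + k_n * h[::-1])[:-1] / (1.0 - k_n ** 2)
        K.append(k_n)
    return np.array(K[::-1])                  # ordered K_0, K_1, ..., K_{N-1}

def forward_levinson_from_K(K):
    """Rebuild a_1..a_N from K_0..K_{N-1} using eq. (19)."""
    a = np.zeros(0)
    for k_n in K:
        a = np.concatenate((a - k_n * a[::-1], [k_n]))
    return a

# Hypothetical stable filter: H_2(z) = 1 - 0.9 z^{-1} + 0.2 z^{-2}, zeros at 0.4 and 0.5
a = np.array([0.9, -0.2])
K = backward_levinson(a)
a_back = forward_levinson_from_K(K)
```

For this example the recursion gives $K_1 = -0.2$ and $K_0 = 0.75$, and the forward recursion returns the starting coefficients.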

The  Orthogonal  Polynomial  Approach 

The theory that has been presented is complete in itself; however, it
should be pointed out that the results that have been derived are often arrived at
in the literature by a completely different path making use of the theory of polynomials
orthogonal on the unit circle*. The details of this alternate approach will now
be presented. The first part of this section will deal exclusively with the theory
of these polynomials with the connection to the speech problem being made later.

* This section is based on references 3, 4 and 5.


A weighting function $w(z)$ is defined to be any function that satisfies
$w(z) \ge 0$ on the unit circle and, in addition, satisfies

$$ \int w(z) \, df > 0 \qquad (50) $$

A finite or infinite set of polynomials,

$$ \varphi_n(z) = \sum_{k=0}^{n} a_{nk} z^k, \qquad n = 0, 1, \ldots \qquad (51) $$

is said to be orthogonal with respect to the weighting function $w(z)$ on the unit
circle if

a) $a_{nn} > 0$, $n = 0, 1, \ldots$

b) $\int \varphi_n(z) \, \overline{\varphi_m(z)} \, w(z) \, df = \delta_{nm}$    (52)

In equation (52), the overbar denotes complex conjugation and $\delta_{nm}$ the Kronecker
delta.

It will now be shown that, given any weighting function, there exists a set
of polynomials satisfying conditions a) and b). The proof will proceed by induction
by defining,

$$ \varphi_0(z) = c_0^{-1/2} \qquad (53) $$

where

$$ c_0 = \int w(z) \, df \qquad (54) $$

The set of polynomials consisting of $\varphi_0(z)$ alone obviously satisfies a) and b).

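Assume now that a set of N polynomials satisfying a) and b) has been
constructed and enlarge this set by one by defining

$$ \varphi_N(z) = A \left[ z^N - \sum_{k=0}^{N-1} \alpha_k \varphi_k(z) \right] \qquad (55) $$

where $A$ and the $\alpha_k$'s are to be determined. It follows that

$$ \int \varphi_N(z) \, \overline{\varphi_\ell(z)} \, w(z) \, df = A \left[ \int z^N \, \overline{\varphi_\ell(z)} \, w(z) \, df - \alpha_\ell \right], \qquad \ell = 0, \ldots, N-1 \qquad (56) $$

It is now obvious from equation (56) that condition b) will be satisfied by
defining

$$ \alpha_\ell = \int z^N \, \overline{\varphi_\ell(z)} \, w(z) \, df, \qquad A^{-2} = \int \left| z^N - \sum_{k=0}^{N-1} \alpha_k \varphi_k(z) \right|^2 w(z) \, df \qquad (57) $$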


The last equation is meaningful only if the integral appearing in it does not
vanish, which is always the case because it is well known that the powers of z form
a linearly independent set. Finally, if the positive square root is always taken
in equation (57), it follows that condition a) is also satisfied by the enlarged
set of polynomials. The proof of existence is complete.
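The construction in equations (55) to (57) is a Gram-Schmidt procedure and can be sketched numerically by sampling the unit circle. This is a minimal sketch, assuming NumPy; integrals over f are replaced by means over a uniform grid, and the function name, the grid size, and the test frame used to build a weighting function of the form $|S(z)|^2$ are hypothetical.

```python
import numpy as np

def orthonormal_polynomials(w, z, order):
    """Gram-Schmidt construction following eqs. (55)-(57).

    w, z: samples of the weighting function and of z = exp(j 2 pi f) on a uniform
    grid of f in [-1/2, 1/2); integrals over f are approximated by grid means.
    Returns a list whose n-th entry holds the samples of phi_n(z).
    """
    def inner(g, h):
        return np.mean(g * np.conj(h) * w)

    phi = []
    for n in range(order + 1):
        q = z ** n
        for phi_k in phi:
            q = q - inner(z ** n, phi_k) * phi_k    # alpha_k from eq. (57)
        A = 1.0 / np.sqrt(inner(q, q).real)         # positive root, so a_nn > 0
        phi.append(A * q)
    return phi

# Hypothetical frame standing in for speech samples; w(z) = |S(z)|^2
frame = np.array([1.0, 0.5, -0.3, 0.8, 0.2, -0.6, 0.1, 0.4])
M = 256
f = np.arange(M) / M - 0.5
z = np.exp(2j * np.pi * f)
S = sum(s_n * z ** (-n) for n, s_n in enumerate(frame))
w = np.abs(S) ** 2
phi = orthonormal_polynomials(w, z, order=3)
gram = np.array([[np.mean(pn * np.conj(pm) * w) for pm in phi] for pn in phi])
```

The matrix `gram` approximates the integrals in condition b) and should be close to the identity.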

Next it will be shown that a set of polynomials satisfying a) and b)
is unique. Assume the contrary. Then there exist two different sets of polynomials
$\varphi_n(z)$ and $\varphi'_n(z)$ both satisfying a) and b). Next, note that it follows from
condition b) that $z^n$ can be written as a linear combination of $\varphi_n(z), \varphi_{n-1}(z), \ldots,
\varphi_0(z)$. (This is obvious for n = 0 and follows by a simple induction for the other
powers of z.) This fact in turn implies that

$$ \int \varphi_n(z) \, \overline{z^k} \, w(z) \, df = 0, \qquad k = 0, 1, \ldots, n-1 \qquad (58) $$

Now, because there are two sets of polynomials satisfying a) and b), it follows
that the polynomial

$$ p(z) = \varphi_n(z) - \frac{k_n}{k'_n} \varphi'_n(z) \qquad (59) $$

where $k_n$ and $k'_n$ denote the coefficient of $z^n$ in $\varphi_n(z)$ and $\varphi'_n(z)$ respectively, is of
degree no higher than n-1.
From this fact and equation (58), it follows that

$$ \int |p(z)|^2 \, w(z) \, df = \int \left[ \varphi_n(z) - \frac{k_n}{k'_n} \varphi'_n(z) \right] \overline{p(z)} \, w(z) \, df = 0 \qquad (60) $$

and, therefore, that $p(z) = 0$ which implies that

$$ \varphi_n(z) = \frac{k_n}{k'_n} \varphi'_n(z) \qquad (61) $$

However, $k_n = k'_n$ because both are positive by condition a) and

$$ \int |\varphi_n(z)|^2 \, w(z) \, df = \int |\varphi'_n(z)|^2 \, w(z) \, df \qquad (62) $$

and the uniqueness of any set of polynomials satisfying a) and b) has been
established.

It is now possible to establish a number of important properties of orthogonal
polynomials. The first of these is the fact that all the zeros of a set of polynomials
satisfying a) and b) lie inside the unit circle. To prove this fact, let $z_0$ be a zero
of $\varphi_n(z)$: $\varphi_n(z_0) = 0$. The polynomial $\varphi_n(z)/(z - z_0)$ is then of degree
n-1 and it follows from equation (58) that

$$ \int \varphi_n(z) \, \overline{\left[ \frac{\varphi_n(z)}{z - z_0} \right]} \, w(z) \, df = 0 \qquad (63) $$

Equation (63) can easily be rewritten in the form,

$$ \int (z - z_0) \left| \frac{\varphi_n(z)}{z - z_0} \right|^2 w(z) \, df = 0 \qquad (64) $$

from which it follows that,

$$ z_0 \int \left| \frac{\varphi_n(z)}{z - z_0} \right|^2 w(z) \, df = \int z \left| \frac{\varphi_n(z)}{z - z_0} \right|^2 w(z) \, df \qquad (65) $$

Since $|z| = 1$, a simple application of the Schwarz inequality to equation (65) now
shows that $|z_0| < 1$, where the strong inequality follows from the fact that
z is not proportional to unity on the unit circle. This proves the theorem.

The next fact to be established provides the link between the theory
of orthogonal polynomials and the speech problem introduced earlier. The property
of orthogonal polynomials that accomplishes this is embodied in the statement
that a scaled version of $\varphi_n(z)$, namely $k_n^{-1} \varphi_n(z)$, minimizes the integral

$$ \int |p_n(z)|^2 \, w(z) \, df \qquad (66) $$

where the minimum is taken over all polynomials of the form $p_n(z) = z^n + a_{n-1} z^{n-1} + \ldots + a_0$.
The minimum itself is $k_n^{-2}$ where $k_n$ denotes the coefficient of $z^n$ in $\varphi_n(z)$.

The proof of this statement can be established by first noting that
since $z^n$ can be written as a linear combination of $\varphi_n(z), \varphi_{n-1}(z), \ldots, \varphi_0(z)$,
it follows that any $p_n(z)$ can be represented as

$$ p_n(z) = \sum_{k=0}^{n} \nu_k \varphi_k(z) \qquad (67) $$

where $\nu_n = k_n^{-1}$ in order to force the coefficient of $z^n$ in $p_n(z)$ to be unity.
Substitution of equation (67) in equation (66) yields

$$ \int |p_n(z)|^2 \, w(z) \, df = \sum_{k=0}^{n} |\nu_k|^2 \ge |\nu_n|^2 = k_n^{-2} \qquad (68) $$

However, the lower bound given in equation (68) can be achieved by setting $\nu_k = 0$,
$k = 0, \ldots, n-1$, and the proof of the minimization property of orthogonal polynomials
follows.


The connection to the speech problem now follows by recalling that this problem
boiled down to minimizing the mean-squared error given by equation (4). Using Parseval's
theorem, this equation can be rewritten in the z-transform domain with the result,

$$ e = \int |H_p(z)|^2 \, |S(z)|^2 \, df \qquad (69) $$

where

$$ H_p(z) = 1 - \sum_{k=1}^{p} a_k z^{-k} $$

Since $|z^p| = 1$ on the unit circle, minimizing the integral in equation (69)
is the same as minimizing the integral given by

$$ \int |z^p H_p(z)|^2 \, |S(z)|^2 \, df \qquad (70) $$

But $z^p H_p(z)$ is a pth order polynomial with lead coefficient unity and it follows from
the above minimization property of orthogonal polynomials that the minimum of
(70) is given by $k_p^{-2}$ and is achieved when

$$ z^p H_p(z) = k_p^{-1} \varphi_p(z) \qquad (72) $$

Here, $\varphi_p(z)$ denotes the pth orthogonal polynomial with respect to the weighting
function given by

$$ w(z) = |S(z)|^2 \qquad (73) $$

The above argument has transformed the speech problem under consideration
from one of minimizing a certain integral to one of finding the pth order orthogonal
polynomial with respect to the weighting function $|S(z)|^2$. There exist explicit
expressions for the polynomials orthogonal with respect to an arbitrary weighting function;
however, their evaluation requires the computation of large determinants. A
computationally more attractive approach to the evaluation of the coefficients of
$\varphi_p(z)$ is available, however, because of the existence of a recursion formula
for the orthogonal polynomials. The existence of such a recursion formula should
come as no surprise; in fact, from the discussion in the previous section, it should
be obvious that the desired recursion must be equivalent to the Levinson recursion.

To derive this new version of the recursion, substitute equation (72) into the Levinson
recursion, equation (23), with the result

$$ k_{n+1}^{-1} \varphi_{n+1}(z) = k_n^{-1} z \, \varphi_n(z) - K_n k_n^{-1} z^n \varphi_n(z^{-1}) \qquad (74) $$

Next, the fact that $k_n^{-2}$ is the mean squared error at the nth stage coupled with
equation (30) yields the final recursion formula

$$ \varphi_{n+1}(z) = \left( 1 - K_n^2 \right)^{-1/2} \left[ z \, \varphi_n(z) - K_n z^n \varphi_n(z^{-1}) \right] \qquad (75) $$

The $K_n$'s appearing in equation (75) are still given by equation (21) where
now

$$ R_k = \int z^k \, w(z) \, df \qquad (76) $$
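The recursion of equation (75) can be sketched at the coefficient level, with the $K_n$'s obtained from equation (21). This is a minimal sketch, assuming NumPy and real-valued autocorrelations; the function name and the test frame are hypothetical. Each polynomial is stored as an array of coefficients in ascending powers of z, so that $z\,\varphi_n(z)$ is a shift and $z^n \varphi_n(z^{-1})$ is a reversal.

```python
import numpy as np

def orthogonal_polynomials_by_recursion(R, order):
    """Build phi_0..phi_order from R_0..R_order via eqs. (21) and (75).

    Returns (phi, a): phi[n] holds the coefficients of phi_n(z) in ascending
    powers of z, and a holds the final predictor coefficients a_1..a_order.
    """
    R = np.asarray(R, dtype=float)
    a = np.zeros(0)
    phi = [np.array([R[0] ** -0.5])]              # phi_0 = c_0^{-1/2}, eqs. (53)-(54)
    for n in range(order):
        k_n = (R[n + 1] - np.dot(a, R[n:0:-1])) / (R[0] - np.dot(a, R[1:n + 1]))  # eq. (21)
        a = np.concatenate((a - k_n * a[::-1], [k_n]))                            # eq. (19)
        c = phi[-1]
        shifted = np.concatenate(([0.0], c))      # z * phi_n(z)
        reflected = np.append(c[::-1], 0.0)       # z^n * phi_n(1/z)
        phi.append((shifted - k_n * reflected) / np.sqrt(1.0 - k_n ** 2))         # eq. (75)
    return phi, a

# Hypothetical frame standing in for speech samples; R_k as in eq. (6)
frame = np.array([1.0, 0.5, -0.3, 0.8, 0.2, -0.6, 0.1, 0.4])
R = np.array([np.dot(frame[k:], frame[:len(frame) - k]) for k in range(4)])
phi, a = orthogonal_polynomials_by_recursion(R, 3)

# Gram matrix of the phi_n under the weighting function, using eq. (76):
# integral of phi_n conj(phi_m) w df = sum_{i,j} c_i d_j R_{i-j}
T = R[np.abs(np.subtract.outer(np.arange(4), np.arange(4)))]
C = np.zeros((4, 4))
for n, c in enumerate(phi):
    C[n, :len(c)] = c
gram = C @ T @ C.T
k_p = (R[0] - np.dot(a, R[1:])) ** -0.5           # k_p^{-2} is the minimum error, eq. (11)
```

Under these assumptions `gram` should be close to the identity, and the last polynomial should equal `k_p` times the coefficients of $z^p H_p(z)$, as stated by equation (72).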


Conclusion 

The basic mathematics relating to the linear predictive filtering approach
to speech analysis/synthesis has now been presented. The analysis began by
postulating that speech is produced by exciting an all-pole filter with either a uniform
impulse train or white noise. A minimum mean-squared error technique for
estimating the parameters of an all-pole filter from a segment of speech data
was then introduced and an explicit expression for this filter in terms of the
speech data was derived.

Next, a numerically attractive recursive technique for computing this filter
was derived and it was shown that this filter must always be stable. This filter
can be realized in a variety of ways, such as direct form and cascade form, and in addition,
it was demonstrated that it also can be realized as a non-uniform acoustic tube.
The reflection coefficients defining this tube are generated as a matter of course
when computing the filter by means of the recursive technique just mentioned.